Cache Conscious Column Organization in In-Memory Column Stores
Authors
Abstract
Cost models are an essential part of database systems, as they are the basis of query performance optimization. Based on predictions made by cost models, the fastest query execution plan can be chosen and executed, or algorithms can be tuned and optimized. In-memory databases shift the focus from disk to main memory accesses and CPU costs, whereas in disk-based systems input and output costs dominate the overall costs and other processing costs are often neglected. However, modeling memory accesses is fundamentally different, and common models no longer apply. This work presents a detailed parameter evaluation for the plan operators scan with equality selection, scan with range selection, positional lookup, and insert in in-memory column stores. Based on this evaluation, we develop a cost model based on cache misses for estimating the runtime of the considered plan operators using different data structures. We consider uncompressed columns, bit-compressed columns, and dictionary-encoded columns with sorted and unsorted dictionaries. Furthermore, we discuss tree indices on the columns and dictionaries. Finally, we consider partitioned columns consisting of one partition with a sorted dictionary and one with an unsorted dictionary. New values are inserted into the unsorted dictionary partition and moved periodically by a merge process to the sorted partition. We propose an efficient merge algorithm that supports the update performance required to run enterprise applications on read-optimized databases, and we provide a memory-traffic-based cost model for the merge process.
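To make the partitioned layout described in the abstract more concrete, the following is a minimal Python sketch, not the paper's algorithm. It assumes a toy column with a main partition (sorted dictionary) and a delta partition (unsorted dictionary); the class name `PartitionedColumn` and all method names are hypothetical, and the merge shown here simply rebuilds the sorted dictionary and remaps value ids without any of the cache-aware optimizations the work is about.

```python
from bisect import bisect_left


class PartitionedColumn:
    """Toy dictionary-encoded column: sorted main partition plus unsorted delta.

    Hypothetical illustration only; names and layout are assumptions.
    """

    def __init__(self):
        self.main_dict = []    # sorted distinct values
        self.main_ids = []     # one value id per row, indexing main_dict
        self.delta_dict = []   # distinct values in arrival order (unsorted)
        self.delta_index = {}  # value -> position in delta_dict
        self.delta_ids = []    # one value id per row, indexing delta_dict

    def insert(self, value):
        # New rows always go to the delta partition; its dictionary is
        # append-only, so existing value ids never need to be rewritten.
        vid = self.delta_index.get(value)
        if vid is None:
            vid = len(self.delta_dict)
            self.delta_dict.append(value)
            self.delta_index[value] = vid
        self.delta_ids.append(vid)

    def get(self, pos):
        # Positional lookup: main rows come first, delta rows after them.
        if pos < len(self.main_ids):
            return self.main_dict[self.main_ids[pos]]
        return self.delta_dict[self.delta_ids[pos - len(self.main_ids)]]

    def scan_equal(self, value):
        # Scan with equality selection: translate the value to an id once
        # per partition, then compare ids instead of values.
        hits = []
        i = bisect_left(self.main_dict, value)  # binary search, sorted dict
        if i < len(self.main_dict) and self.main_dict[i] == value:
            hits += [p for p, vid in enumerate(self.main_ids) if vid == i]
        d = self.delta_index.get(value)
        if d is not None:
            offset = len(self.main_ids)
            hits += [offset + p for p, vid in enumerate(self.delta_ids) if vid == d]
        return hits

    def merge(self):
        # Naive merge: build a new sorted dictionary over both partitions,
        # remap all value ids, and empty the delta partition.
        new_dict = sorted(set(self.main_dict) | set(self.delta_dict))
        new_id = {v: i for i, v in enumerate(new_dict)}
        main_map = [new_id[v] for v in self.main_dict]
        delta_map = [new_id[v] for v in self.delta_dict]
        self.main_ids = ([main_map[i] for i in self.main_ids]
                         + [delta_map[i] for i in self.delta_ids])
        self.main_dict = new_dict
        self.delta_dict, self.delta_index, self.delta_ids = [], {}, []
```

In this sketch, inserts touch only the delta partition, so they stay cheap, while reads on the main partition can use binary search on the sorted dictionary; the periodic `merge()` pays the cost of re-sorting and remapping in one batch.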
Similar resources
Cache-Conscious Radix-Decluster Projections
As CPUs become more powerful with Moore’s law and memory latencies stay constant, the impact of the memory access performance bottleneck continues to grow on relational operators like join, which can exhibit random access on a memory region larger than the hardware caches. While cache-conscious variants for various relational algorithms have been described, previous work has mostly ignored (the...
Design and Evaluation of Storage Organizations for Read-Optimized Main Memory Databases
Existing main memory data processing systems employ a variety of storage organizations and make a number of storage-related design choices. The focus of this paper is on systematically evaluating a number of these key storage design choices for main memory analytical (i.e. read-optimized) database settings. Our evaluation produces a number of key insights: First, it is always beneficial to organ...
Application-Specific Memory Management for Embedded Systems
We propose a methodology to improve the performance of embedded processors running data-intensive applications by allowing embedded software to manage on-chip memory on an application-specific or task-specific basis. We provide this management ability with a novel hardware mechanism, column caching. Column caching provides software with the ability to dynamically partition the cache. Data can b...
Positional Delta Trees for Updating Column Stores
In this talk we introduce a data structure, the positional delta tree (PDT), for maintaining differential updates against a read-optimized, column-oriented, relational DBMS, and present techniques for merging such updates into an ongoing table scan, to provide an up-to-date view of the data stored in the database. Traditionally, transaction processing systems have been built on top of row-orient...
Reduction in Cache Memory Power Consumption based on Replacement Quantity
Today power consumption is considered to be one of the important issues. Therefore, its reduction plays a considerable role in developing systems. Previous studies have shown that approximately 50% of total power consumption is used in cache memories. There is a direct relationship between power consumption and replacement quantity made in cache. The less the number of replacements is, the less...
Publication date: 2013